Smart Paper Notes

Space optimal and asymptotically move optimal Arbitrary Pattern Formation on rectangular grid by asynchronous robot swarm

Avisek Sharma , Satakshi Ghosh , Pritam Goswami , Buddhadeb Sau

Category: Robotics

2022-12-06

Arbitrary pattern formation (APF) is a well-studied problem in swarm robotics. The problem has so far been considered in two settings: the plane and the infinite grid. This work deals with the problem in the infinite rectangular grid setting. Previous work on the APF problem in the infinite grid has a fundamental issue: the deterministic algorithms use a lot of grid space to solve the problem, mainly to maintain asymmetry of the configuration or to avoid collisions. These techniques are not useful if the application has a space constraint. In this work, we consider luminous robots (with one light that can take two colors) in order to avoid symmetry, and we carefully design a deterministic algorithm that solves the APF problem using the minimal required space in the grid. The robots are autonomous, identical, and anonymous, and they operate in Look-Compute-Move cycles under a fully asynchronous scheduler. The APF algorithm proposed in [WALCOM'2019] by Bose et al. can be modified using luminous robots so that it uses minimal space, but that algorithm is not move optimal. The algorithm proposed in this paper not only uses minimal space but is also asymptotically move optimal. It is designed for an infinite rectangular grid but can easily be modified to work in a finite grid as well.

translated by Google Translate

Causes and Cures for Interference in Multilingual Translation

Uri Shaham , Maha Elbayad , Vedanuj Goswami , Omer Levy , Shruti Bhosale

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-14

Multilingual machine translation models can benefit from synergy between different language pairs, but also suffer from interference. While there is a growing number of sophisticated methods that aim to eliminate interference, our understanding of interference as a phenomenon is still limited. This work identifies the main factors that contribute to interference in multilingual machine translation. Through systematic experimentation, we find that interference (or synergy) is primarily determined by model size, data size, and the proportion of each language pair within the total dataset. We observe that substantial interference occurs mainly when the model is very small with respect to the available training data, and that using standard transformer configurations with less than one billion parameters largely alleviates interference and promotes synergy. Moreover, we show that tuning the sampling temperature to control the proportion of each language pair in the data is key to balancing the amount of interference between low- and high-resource language pairs effectively, and can lead to superior performance overall.
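
As a rough illustration of how a sampling temperature controls the proportion of each language pair, the sketch below assumes the commonly used rule p_i proportional to (n_i / N)^(1/T), where n_i is the number of examples for pair i. The abstract does not state the exact rule, and the function name and corpus sizes are hypothetical.

```python
def sampling_proportions(pair_sizes, temperature):
    """Return the sampling probability of each language pair.

    Assumption: p_i is proportional to (n_i / N) ** (1 / T), a common
    formulation; the abstract itself does not specify the rule.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    total = sum(pair_sizes.values())
    # Raising the data share to 1/T flattens the distribution as T grows.
    weights = {
        pair: (n / total) ** (1.0 / temperature)
        for pair, n in pair_sizes.items()
    }
    norm = sum(weights.values())
    return {pair: w / norm for pair, w in weights.items()}


# Hypothetical corpus sizes: one high-resource and one low-resource pair.
sizes = {"en-fr": 9_000_000, "en-is": 1_000_000}
for t in (1, 5):
    print(t, sampling_proportions(sizes, t))
```

With T = 1 the pairs are sampled in proportion to their data size; larger T moves the proportions toward uniform, which gives low-resource pairs more weight.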

XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning

Pritam Sarkar , Ali Etemad

Category: Computer Vision

2022-11-25

We present XKD, a novel self-supervised framework to learn meaningful representations from unlabelled video clips. XKD is trained with two pseudo tasks. First, masked data reconstruction is performed to learn modality-specific representations. Next, self-supervised cross-modal knowledge distillation is performed between the two modalities through teacher-student setups to learn complementary information. To identify the most effective information to transfer and also to tackle the domain gap between audio and visual modalities which could hinder knowledge transfer, we introduce a domain alignment strategy for effective cross-modal distillation. Lastly, to develop a general-purpose solution capable of handling both audio and visual streams, a modality-agnostic variant of our proposed framework is introduced, which uses the same backbone for both audio and visual modalities. Our proposed cross-modal knowledge distillation improves linear evaluation top-1 accuracy of video action classification by 8.4% on UCF101, 8.1% on HMDB51, 13.8% on Kinetics-Sound, and 14.2% on Kinetics400. Additionally, our modality-agnostic variant shows promising results in developing a general-purpose network capable of handling different data streams. The code is released on the project website.

Bounding Box Priors for Cell Detection with Point Annotations

Hari Om Aggrawal , Dipam Goswami , Vinti Agarwal

Category: Computer Vision

2022-11-11

The size of an individual cell type, such as a red blood cell, does not vary much among humans. We use this knowledge as a prior for classifying and detecting cells in images with only a few ground truth bounding box annotations, while most of the cells are annotated with points. This setting leads to weakly semi-supervised learning. We propose replacing points with either stochastic (ST) boxes or bounding box predictions during the training process. The proposed "mean-IOU" ST box maximizes the overlap with all the boxes belonging to the sample space with a class-specific approximated prior probability distribution of bounding boxes. Our method trains with both box- and point-labelled images in conjunction, unlike the existing methods, which train first with box- and then point-labelled images. In the most challenging setting, when only 5% of the images are box-labelled, quantitative experiments on a urine dataset show that our one-stage method outperforms two-stage methods by 5.56 mAP. Furthermore, we suggest an approach that partially answers "how many box-labelled annotations are necessary?" before training a machine learning model.

Delay Embedded Echo-State Network: A Predictor for Partially Observed Systems

Debdipta Goswami

Category: Machine Learning

2022-11-11

This paper considers the problem of data-driven prediction of partially observed systems using a recurrent neural network. While neural-network-based dynamic predictors perform well with full-state training data, prediction with partial observation during the training phase poses a significant challenge. Here a predictor for partial observations is developed using an echo-state network (ESN) and time delay embedding of the partially observed state. The proposed method is theoretically justified with Takens' embedding theorem and strong observability of a nonlinear system. The efficacy of the proposed method is demonstrated on three systems: two synthetic datasets from chaotic dynamical systems and a set of real-time traffic data.
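
To make the time delay embedding step concrete, the following minimal sketch stacks lagged copies of a scalar observation into vectors that a predictor such as an ESN could take as input. It covers only the embedding, not the network, and it is not the author's implementation; the function name and parameters are hypothetical.

```python
def delay_embed(series, dim, lag=1):
    """Build delay vectors from a scalar time series.

    Row for time t is [y[t], y[t - lag], ..., y[t - (dim - 1) * lag]].
    Only times with a full history are returned.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    start = (dim - 1) * lag  # first index with enough past samples
    return [
        [series[t - k * lag] for k in range(dim)]
        for t in range(start, len(series))
    ]


# A short observed signal embedded in three dimensions with unit lag.
observed = [0, 1, 2, 3, 4]
print(delay_embed(observed, dim=3))
```

The embedding dimension and lag are design choices; Takens' theorem gives conditions under which a large enough dimension makes the delay vectors a faithful representation of the underlying state.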

Pitfalls of Climate Network Construction: A Statistical Perspective

Moritz Haas , Bedartha Goswami , Ulrike von Luxburg

Categories: Machine Learning | Machine Learning (Statistics)

2022-11-05

Network-based analyses of dynamical systems have become increasingly popular in climate science. Here we address network construction from a statistical perspective and highlight the often ignored fact that the calculated correlation values are only empirical estimates. To measure spurious behaviour as deviation from a ground truth network, we simulate time-dependent isotropic random fields on the sphere and apply common network construction techniques. We find several ways in which the uncertainty stemming from the estimation procedure has a major impact on network characteristics. When the data has a locally coherent correlation structure, spurious link bundle teleconnections and spurious high-degree clusters have to be expected. Anisotropic estimation variance can also induce severe biases into empirical networks. We validate our findings with ERA5 reanalysis data. Moreover, we explain why commonly applied resampling procedures are inappropriate for significance evaluation and propose a statistically more meaningful ensemble construction framework. By communicating which difficulties arise in estimation from scarce data and by presenting which design decisions increase robustness, we hope to contribute to more reliable climate network construction in the future.

Mutual Information and Ensemble Based Feature Recommender for Renal Cancer Stage Classification

Abhishek Dey , Debayan Goswami , Rahul Roy , Susmita Ghosh , Yu Shrike Zhang , Jonathan H. Chan

Category: Machine Learning

2022-09-28

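The kidney is an essential organ of the human body. It maintains homeostasis and removes harmful substances through urine. Renal cell carcinoma (RCC) is the most common form of kidney cancer; around 90% of kidney cancers are attributed to RCC. The most harmful type of RCC is clear cell renal cell carcinoma (ccRCC), which makes up about 80% of all RCC cases. Early and accurate detection of ccRCC is necessary to prevent the disease from spreading further to other organs. In this article, detailed experiments are carried out to identify important features that can aid in diagnosing ccRCC at different stages. The ccRCC dataset is obtained from The Cancer Genome Atlas (TCGA). A novel mutual information and ensemble based feature ranking approach is proposed, which considers the order of features obtained from 8 popular feature selection methods. The performance of the proposed method is evaluated by the overall classification accuracy obtained using 2 different classifiers (ANN and SVM). Experimental results show that the proposed feature ranking method attains higher accuracy with both SVM and NN when classifying the different stages of ccRCC, compared with existing work. It is also worth noting that, of the 3 distinguishing features mentioned by the existing TNM system (proposed by AJCC and UICC), our proposed method selects two (size of tumour, metastasis status) as the top-most ones. This establishes the efficacy of our proposed approach.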

Concordance based Survival Cobra with regression type weak learners

Rahul Goswami , Arabin Kumar Dey

Categories: Machine Learning (Statistics) | Artificial Intelligence | Machine Learning

2022-09-24

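In this paper, we predict conditional survival functions through a combined regression strategy. We take different random survival trees as the weak learners. We propose maximizing concordance in the right-censored setting to find the optimal parameters. We explore two approaches: the usual Survival COBRA and a novel weighted predictor based on the concordance index. Our proposed formulations use two different norms, the max-norm and the Frobenius norm, to find a proximity set of predictions for query points in the test dataset. We illustrate our algorithms through implementations on three different real-life datasets.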
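
For readers unfamiliar with the concordance index under right censoring, here is a minimal sketch of Harrell's C-index, the standard definition of that quantity. It is an illustration of the metric only, not the paper's optimization procedure, and it assumes predictions are given as risk scores (higher score means earlier expected event).

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times:       observed time for each subject
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event has the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: the third subject is censored at time 3.
print(concordance_index([1, 2, 3], [1, 1, 0], [3.0, 2.0, 1.0]))
```

A value of 1 means every comparable pair is ordered correctly, and 0.5 corresponds to random ordering.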

Control Barrier Functions in UGVs for Kinematic Obstacle Avoidance: A Collision Cone Approach

Phani Thontepu , Bhavya Giri Goswami , Neelaksh Singh , Shyamsundar P I , Shyam Sundar M G , Suresh Sundaram , Vaibhav Katewa , Shishir Kolathaya

Category: Robotics

2022-09-23

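In this paper, we propose a new class of Control Barrier Functions (CBFs) for Unmanned Ground Vehicles (UGVs) that help avoid collisions with kinematic (non-zero velocity) obstacles. While current forms of CBFs have been successful in guaranteeing safety and collision avoidance with static obstacles, extensions to the dynamic case have seen limited success. Moreover, with UGV models such as the unicycle or the bicycle, applications of existing CBFs have been conservative in terms of control, i.e., steering or thrust control is not possible in certain scenarios. Drawing inspiration from the classical use of collision cones for obstacle avoidance in trajectory planning, we introduce a novel CBF formulation of the collision cone with safety guarantees for both the unicycle and bicycle models. The main idea is to ensure that the velocity of the obstacle w.r.t. the vehicle always points away from the vehicle. Accordingly, we construct a constraint that ensures that this velocity vector always avoids the cone of vectors pointing at the vehicle. The efficacy of this new control methodology is experimentally verified on the Copernicus mobile robot. We further extend it to self-driving cars in the form of bicycle models and demonstrate collision avoidance under various scenarios in the CARLA simulator.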
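
The geometric idea behind the collision cone can be sketched as a simple check in the plane. The code below is an illustration under stated assumptions, not the paper's controller: the obstacle is modelled as a disc of a given safety radius, and the margin is written as h = <p_rel, v_rel> + ||v_rel|| * sqrt(||p_rel||^2 - r^2), one common way to express "the relative velocity lies outside the cone". All names are hypothetical.

```python
import math


def collision_cone_margin(p_vehicle, v_vehicle, p_obstacle, v_obstacle, radius):
    """Return a margin h that is >= 0 when the obstacle's velocity relative
    to the vehicle lies outside the cone of directions leading to a collision.

    Assumption: planar motion and a disc-shaped obstacle of the given radius.
    """
    # Position and velocity of the obstacle relative to the vehicle.
    p_rel = (p_obstacle[0] - p_vehicle[0], p_obstacle[1] - p_vehicle[1])
    v_rel = (v_obstacle[0] - v_vehicle[0], v_obstacle[1] - v_vehicle[1])
    dist_sq = p_rel[0] ** 2 + p_rel[1] ** 2
    if dist_sq <= radius ** 2:
        raise ValueError("vehicle is already inside the safety radius")
    dot = p_rel[0] * v_rel[0] + p_rel[1] * v_rel[1]
    speed = math.hypot(v_rel[0], v_rel[1])
    # ||p_rel|| * cos(phi) equals sqrt(dist_sq - radius^2), where phi is the
    # half-angle of the cone.
    return dot + speed * math.sqrt(dist_sq - radius ** 2)


# Vehicle driving straight at a static obstacle: the margin is negative.
print(collision_cone_margin((0, 0), (1, 0), (10, 0), (0, 0), radius=1.0))
# Vehicle driving sideways relative to the obstacle: the margin is positive.
print(collision_cone_margin((0, 0), (0, 1), (10, 0), (0, 0), radius=1.0))
```

In a CBF-based controller, a quantity of this kind would be kept non-negative through a constraint on the control input; that step depends on the vehicle model and is omitted here.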

Computer vision based vehicle tracking as a complementary and scalable approach to RFID tagging

Pranav Kant Gaur , Abhilash Bhardwaj , Pritam Shete , Mohini Laghate , Dinesh M Sarode

Category: Computer Vision

2022-09-13

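Logging of incoming and outgoing vehicles is a critical piece of information for root-cause analysis when combating security breach incidents in various sensitive organizations. RFID tagging hampers the scalability of vehicle tracking solutions on both the logistical and the technical front. For instance, requiring every incoming vehicle (departmental or private) to be RFID tagged is a severe constraint, and coupling video analytics with RFID to detect abnormal vehicle movement is non-trivial. We leverage publicly available implementations of computer vision algorithms to develop an interpretable vehicle tracking algorithm using the finite-state machine formalism. The state machine consumes input from cascaded object detection and optical character recognition (OCR) models for its state transitions. We evaluated the proposed method on 75 video clips of 285 vehicles from our system deployment site. We observed that the detection rate is most affected by the speed and type of the vehicle. The highest detection rate is achieved when vehicle movement at the checkpoint is restricted to a movement protocol similar to that of RFID tagging. We further analyzed 700 vehicle tracking predictions on live data and found that the majority of vehicle number prediction errors are due to illegible text, image blur, text occlusion, and out-of-vocabulary letters in vehicle numbers. Towards system deployment and performance enhancement, we expect our ongoing system monitoring to provide evidence for establishing a higher vehicle-throughput SOP at the security checkpoint, and to drive the fine-tuning of the deployed computer vision models and the state machine, so as to establish the proposed approach as a promising alternative to RFID tagging.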
